The Capacity of Private Information Retrieval From Uncoded Storage Constrained Databases

https://doi.org/10.1109/TIT.2020.3023016

Attia, Mohamed Adel; Kumar, Deepak; Tandon, Ravi (November 2020, IEEE Transactions on Information Theory)
Full Text Available
Approximately Optimal Distributed Data Shuffling

https://doi.org/10.1109/ISIT.2018.8437325

Attia, Mohamed Adel; Tandon, Ravi (June 2018, 2018 IEEE International Symposium on Information Theory (ISIT))

Data shuffling between distributed workers is one of the critical steps in implementing large-scale learning algorithms. The focus of this work is to understand the fundamental trade-off between the amount of storage and the communication overhead for distributed data shuffling. We first present an information theoretic formulation for the data shuffling problem, accounting for the underlying problem parameters (i.e., number of workers, K, number of data points, N, and the available storage, S per node). Then, we derive an information theoretic lower bound on the communication overhead for data shuffling as a function of these parameters. Next, we present a novel coded communication scheme and show that the resulting communication overhead of the proposed scheme is within a multiplicative factor of at most 2 from the lower bound. Furthermore, we introduce an improved aligned coded shuffling scheme, which achieves the optimal storage vs communication trade-off for K < 5, and further reduces the maximum multiplicative gap down to 7/6, for K ≥ 5.
Full Text Available
The Capacity of Uncoded Storage Constrained PIR

https://doi.org/10.1109/ISIT.2018.8437729

Attia, Mohamed Adel; Kumar, Deepak; Tandon, Ravi (June 2018, 2018 IEEE International Symposium on Information Theory (ISIT))

Private information retrieval (PIR) allows a user to retrieve a desired message out of K possible messages from N databases (DBs) without revealing the identity of the desired message. In this work, we consider the problem of PIR from uncoded storage constrained DBs. Each DB has a storage capacity of μKL bits, where L is the size of each message in bits, and μ ∈ [1/N, 1] is the normalized storage. In the storage constrained PIR problem, there are two key challenges: a) construction of communication efficient schemes through storage content design at each DB that allow download efficient PIR; and b) characterizing the optimal download cost via information-theoretic lower bounds. The novel aspect of this work is to characterize the optimum download cost of PIR with storage constrained DBs for any value of storage. In particular, for any (N, K), we show that the optimal tradeoff between storage (μ) and the download cost (D(μ)) is given by the lower convex hull of the pairs $(t/N,\ 1 + 1/t + 1/t^2 + \dots + 1/t^{K-1})$ for $t = 1, 2, \dots, N$. The main contribution of this paper is the converse proof, i.e., obtaining lower bounds on the download cost for PIR as a function of the available storage.
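As an illustration of the tradeoff stated in this abstract, the following minimal Python sketch enumerates the corner points (t/N, 1 + 1/t + ... + 1/t^(K-1)) for t = 1, ..., N, takes their lower convex hull, and linearly interpolates to evaluate D(μ) at an intermediate storage value. The function names (`corner_points`, `lower_convex_hull`, `download_cost`) are hypothetical and not taken from the paper; the sketch only evaluates the formula quoted above and does not construct any PIR scheme.

```python
from fractions import Fraction


def corner_points(N, K):
    """Pairs (mu, D) = (t/N, 1 + 1/t + ... + 1/t^(K-1)) for t = 1..N."""
    points = []
    for t in range(1, N + 1):
        mu = Fraction(t, N)
        D = sum(Fraction(1, t ** i) for i in range(K))
        points.append((mu, D))
    return points


def _cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_convex_hull(points):
    """Lower hull via the monotone-chain scan, left to right."""
    hull = []
    for p in sorted(points):
        # Drop points that would lie on or above the segment to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def download_cost(mu, N, K):
    """Evaluate the hull at storage mu in [1/N, 1] by linear interpolation."""
    mu = Fraction(mu)
    hull = lower_convex_hull(corner_points(N, K))
    if not hull[0][0] <= mu <= hull[-1][0]:
        raise ValueError("mu must lie in [1/N, 1]")
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= mu <= x1:
            return y0 + (mu - x0) * (y1 - y0) / (x1 - x0)
    return hull[0][1]  # only reached when N == 1 (single corner point)


if __name__ == "__main__":
    for mu, D in lower_convex_hull(corner_points(3, 2)):
        print(mu, D)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the corner points match the closed-form expression without floating-point rounding. At μ = 1/N (t = 1) the expression evaluates to K, and at μ = 1 (t = N) it evaluates to 1 + 1/N + ... + 1/N^(K-1).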
Full Text Available
On the secure degrees-of-freedom of partially connected networks with no CSIT

https://doi.org/10.1109/ICC.2017.7996418

Attia, Mohamed Adel; Tandon, Ravi (May 2017, IEEE International Conference on Communications (ICC))

Full Text Available
Towards the exact rate-memory trade-off for uncoded caching with secure delivery

https://doi.org/10.1109/ALLERTON.2017.8262831

Bahrami, Mohsen; Attia, Mohamed Adel; Tandon, Ravi; Vasic, Bane (October 2017, 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton))

Full Text Available
Information Theoretic Limits of Data Shuffling for Distributed Learning

https://doi.org/10.1109/GLOCOM.2016.7841903

Attia, Mohamed Adel; Tandon, Ravi (December 2016, IEEE Globecom)

Data shuffling is one of the fundamental building blocks of distributed learning algorithms; it increases the statistical gain for each step of the learning process. In each iteration, different shuffled data points are assigned by a central node to a distributed set of workers to perform local computations, which leads to communication bottlenecks. The focus of this paper is on formalizing and understanding the fundamental information-theoretic tradeoff between storage (per worker) and the worst-case communication overhead for the data shuffling problem. We completely characterize the information theoretic tradeoff for K = 2 and K = 3 workers, for any value of storage capacity, and show that increasing the storage across workers can reduce the communication overhead by leveraging coding. We propose a novel and systematic data delivery and storage update strategy for each data shuffle iteration, which preserves the structural properties of the storage across the workers and aids in minimizing the communication overhead in subsequent data shuffling iterations.
Full Text Available
On the worst-case communication overhead for distributed data shuffling

https://doi.org/10.1109/ALLERTON.2016.7852338

Attia, Mohamed Adel; Tandon, Ravi (September 2016, 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton))

Distributed learning platforms for processing large scale data-sets are becoming increasingly prevalent. In typical distributed implementations, a centralized master node breaks the data-set into smaller batches for parallel processing across distributed workers to achieve speed-up and efficiency. Several computational tasks are sequential in nature and involve multiple passes over the data. At each iteration over the data, it is common practice to randomly re-shuffle the data at the master node, assigning different batches for each worker to process. This random re-shuffling operation comes at the cost of extra communication overhead, since at each shuffle, new data points need to be delivered to the distributed workers. In this paper, we focus on characterizing the information theoretically optimal communication overhead for the distributed data shuffling problem. We propose a novel coded data delivery scheme for the case of no excess storage, where every worker can only store the assigned data batches under processing. Our scheme exploits a new type of coding opportunity and is applicable to any shuffle and any number of workers. We also present information theoretic lower bounds on the minimum communication overhead for data shuffling, and show that the proposed scheme matches this lower bound for the worst-case communication overhead.
Full Text Available